Inaugural MAPS Dialogue on Military AI: Opportunities, Risks, and International Peace & Security Session 1 — Redacted Summary

April 4th, 2025

The following article represents a non-exhaustive summary of key elements emerging from the discussion held during the first session of the Military AI, Peace & Security (MAPS) Dialogues, a webinar series convened by the UN Office for Disarmament Affairs with the support of the Republic of Korea to foster inclusive multilateral dialogue on military applications of AI. 

This inaugural MAPS Dialogue, Military AI: Opportunities, Risks, and International Peace & Security, focused on the risks and potential opportunities of AI in the military domain and their implications for peace & security, the importance of the human element across the full life cycle of military AI, and the meaning of trustworthy AI, untangling concepts such as transparency, explainability, and reliability across the life cycle of AI applications. The event was open to representatives of States, international and regional organizations, civil society, the scientific community, and industry.  

1. Taking Stock: Opportunities and Risks for International Peace & Security 

Discussions highlighted two enabling features of AI that hold promise for the development of military applications. First, by speeding up information processing and sharing, AI technologies could facilitate communication across different domains (i.e., land, air, maritime, cyber, space), for instance by augmenting early warning or crisis management, enhance multi-domain coordination and simultaneous engagement, and create a ‘decision advantage’ through faster intelligence processing and decision-making. Second, by improving autonomy via emerging machine learning techniques, AI technologies could help systems maintain functionality in degraded or hostile environments and support personnel safety. 

At the same time, these potential benefits, panelists argued, come alongside new risks before, during, and after a conflict. It was noted that AI technologies may affect the calculus of war (i.e., lower the threshold to conflict) by reducing its cost (e.g., harm to combatants or civilians). 

There was further discussion on related dynamics that may increase the risk of miscalculation. For example, AI-enabled decision-making can compress the timeframe between attack and counterattack, accelerating the escalation cycle beyond the pace of human cognition. Additionally, the paradigm shift from deterministic to probabilistic computing produces less predictable outcomes: probabilistic algorithms reduce both the repeatability of results (i.e., outputs change as the operating environment changes) and the explainability of decisions to human observers (also known as the black box problem). It was noted that, combined with the lack of representative battlefield data, these dynamics may undermine the careful calibrations that prevent conflict in volatile regions, where peace and stability often hinge on human judgment, decision-making, or back-channel communication. As adversarial attempts to confuse or poison AI systems expose the military enterprise to new vulnerabilities, it was suggested that “strategic patience” could help discern legitimate decisions from manipulation, including by non-state actors. Finally, concern was also expressed that the increased agency exhibited by AI systems during deployment may lead downstream users to shirk accountability, for example where AI-enabled decision-support tools undermine humans’ ability to comply with their obligations under international law. It was noted that, as responsibility and accountability cannot be transferred to machines, operators may point fingers at earlier stages of the system’s life cycle, particularly at technology developers. 

2. Reframing: False and missing dichotomies 

Panelists noted that two false dichotomies currently impact discussions around military applications of AI. It was argued that moving beyond false dichotomies clears space for overlooked issues.  

First, the view was expressed that the proliferation of terms used to characterize AI (e.g., ethical, trusted, explainable, human-centric, sustainable, secure, responsible) and the lack of shared understanding of this terminology can create skepticism and mistrust among users, especially among people affected by the use of AI. Some argued that efforts should move from stringing together new adjectives to forging robust working definitions and operationalizing existing ones – human-centric, accountable, safe, secure, and trustworthy. In this regard, a panelist provided a working definition of trustworthy AI, citing the Institute for Defense Analyses: (i) when employed correctly, it will dependably do well what it is intended to do; (ii) when employed correctly, it will dependably not do undesirable things; (iii) when paired with the humans it is intended to work with, it will dependably be employed correctly. 

Second, it was argued that discussions should move beyond the false dichotomy according to which governance efforts come at the expense of innovation. It was suggested that governance actually helps translate human values into operationalizable technological requirements, signals clarity across policy, industry, and military stakeholders, and aligns them in marching to the same beat towards robust – rather than fragile – innovation. 

There was also discussion on terminology and conceptualization. For instance, it was suggested that there is no one “military context” for AI and that risks and opportunities change depending on the specific context of integration (e.g., AI’s integration into logistics versus target identification would merit different risk/opportunity calculations). Additionally, it was noted that speaking of opportunities broadly conflates military advantage – speed or decision advantage – with humanitarian advantage (e.g., compliance with international humanitarian law (IHL) and the protection of civilians). Decoupling the two would reveal a more nuanced reality: increased operational tempo on the battlefield could generate additional risks for civilians or miscalculations. 

3. Governing: What States should prioritize in developing governance frameworks 

Panelists addressed various aspects of the ongoing consideration of potential governance frameworks for military applications of AI. 

It was noted that identifying tasks for which AI systems can be used and tasks for which they are inherently ill-suited is an important starting point. For example, command-and-control systems for nuclear weapons were cited as an out-of-scope application. 

When appropriateness of applications is less clear, experts noted, States should combine object-level – technological, human-centric, semantic – and process-level assessment. In other words, it was suggested that States consider mechanisms to refine the systems themselves, their ability to work with and for humans dependably, the language used to describe them, and the fora in which these discussions happen. 

  1. Technological approaches 

At the object level, trustworthy AI systems can be achieved through assurance. A panelist clarified that assurance is not based on a “gut feeling”; rather, it requires gathering available evidence to build a case by measuring, evaluating, and communicating the level of confidence needed by a specific stakeholder for a specific application in a specific context. 

In a military context, assurance was described as both internal – pure technical performance – and external – broader operational, legal, ethical, and societal considerations. Internally, an effective means to justify levels of technical confidence is through testing, evaluation, verification, and validation (TEVV) methods, which are engineering-centric processes that ensure technical correctness and reliability. Externally, the panelists noted that comprehensive risk assessment practices can proactively identify potential hazards, estimate their likelihood and impact, and decide how to mitigate or accept them.  

Several practices were presented as supporting assurance. Both within and outside defense organizations, technologists and operational personnel should collaborate in realistic environments to identify pitfalls before deployment. Assurance was also presented as a trust-building mechanism, as sharing methodologies and findings signals that proper due diligence is carried out. Additionally, it was noted that by implementing trustworthiness by design, early interventions would become more manageable, efficient, and less resource-intensive than leaving it as an afterthought. 

  2. Human-centric approaches 

UN General Assembly resolution 79/239 emphasizes the human element in three key contexts: it calls for AI to be human-centric, underscores the necessity of maintaining human judgment and control over the use of force, and affirms that AI applications must align with established normative frameworks, such as international humanitarian law and international human rights law. 

From a socio-technical standpoint, panelists suggested that States can create a combination of controls to ensure that humans exercise proper judgment and maintain control over effects. A recurring theme throughout the panel was that systems must be designed and developed to work with and for humans, rather than in place of them, especially for critical decisions. 

Many expressed the view that legal determinations, for instance, must remain human judgments, as only humans – not systems – can be held accountable under IHL. While decision-support tools (such as collateral damage calculation tools) can inform human choices, a panelist highlighted that their statistical outputs are inputs that enable a legal assessment, not the assessment itself – rejecting suggestions to ‘code IHL’ into AI-enabled systems or to rely on such tools for assessments of, for instance, distinction or proportionality. To work effectively for humans, some suggested that safeguards against judgment inhibitors and cognitive tendencies, such as over-trust and automation bias, must be integrated early in the life cycle.  

The view was also expressed that States should pragmatically draw from relevant discussions in other fields, such as cyberspace and civilian self-driving vehicles. It was noted that accountability drives responsibility, which influences behavior: involving stakeholders in the chain of responsibility may motivate them to innovate in new ways. 

  3. Process-level approaches 

The panel highlighted that identifying the right setting and modalities of discussion will affect the breadth, depth, wealth, length, and strength of the conversation and its outcomes. It was noted, for example, that multi-stakeholder dialogues create transparency and trust at different levels. 

The panelists noted that focused discussions yield meaningful operational outcomes. States were encouraged to transition from high-level considerations to tailored frameworks, whether application-based, risk-based, or other approaches. It was argued that a more concentrated focus could help enhance shared understanding and galvanize attention to how specific characteristics of particular technologies used in specific ways challenge our existing legal, normative, and operational frameworks and practices.